Search Result

Journals

Publication Years

Keywords

Please wait a minute...

For Selected:

Download Citations
EndNote Ris BibTeX

Toggle Thumbnails

Select

Imputation algorithm for hybrid information system of incomplete data analysis approach based on rough set theory

PENG Li, ZHANG Haiqing, LI Daiwei, TANG Dan, YU Xi, HE Lei

Journal of Computer Applications 2021, 41 (3): 677-685. DOI: 10.11772/j.issn.1001-9081.2020060894

Abstract （406）

PDF （1135KB）（644）

Save

Concerning the problem of the poor imputation capability of the ROUgh Set Theory based Incomplete Data Analysis Approach (ROUSTIDA) for the Hybrid Information System (HIS) containing multiple attributes such as discrete (e.g., integer, string, and enumeration), continuous (e.g., floating) and missing attributes in the real-world application, a Rough Set Theory based Hybrid Information System for Missing Data Imputation Approach (RSHISMIS) was proposed. Firstly, according to the idea of decision attribute equivalence class partition, HIS was divided to solve the problem of decision rule conflict problem that might occurs after imputation. Secondly, a hybrid distance matrix was defined to reasonably quantify the similarity between objects in order to filter the samples with imputation capability and to overcome the shortcoming of ROUSTIDA that cannot handle with continuous attributes. Thirdly, the nearest-neighbor idea was combined to solve the problem of ROUSTIDA that it cannot impute the data with the same missing attribute in the case of conflict between the attribute values of non-discriminant objects. Finally, experiments were conducted on 10 UCI datasets, and the proposed method was compared with classical algorithms including ROUSTIDA, K Nearest Neighbor Imputation (KNNI), Random Forest Imputation (RFI), and Matrix Factorization (MF). Experimental results show that the proposed method outperforms ROUSTIDA by 81% in recall averagely and 5% to 53% in precision. Meanwhile, the method has the maximal 0.12 reduction of Normalized Root Mean Square Error (NRMSE) compared with ROUSTIDA. Besides, the classification accuracy of the method is 7% higher on average than that of ROUSTIDA, and is also better than those of the imputation algorithms KNNI, RFI and MF.

Reference | Related Articles | Metrics

Select

Mining algorithm of maximal fuzzy frequent patterns

ZHANG Haiqing, LI Daiwei, LIU Yintian, GONG Cheng, YU Xi

Journal of Computer Applications 2017, 37 (5): 1424-1429. DOI: 10.11772/j.issn.1001-9081.2017.05.1424

Abstract （630）

PDF （1047KB）（392）

Save

Combinatorial explosion and the effectiveness of mining results are the essential challenges of meaningful pattern extraction, a Maximal Fuzzy Frequent Pattern Tree Algorithm (MFFP-Tree) based on base-(second-order-effect) pattern structure and uncertainty consideration of items was proposed. Firstly, the fuzziness of items was analyzed comprehensively, the fuzzy support was given, and the fuzzy weight of items in the transaction data set was analyzed, the candidate item set was trimmed according to the fuzzy pruning strategy. Secondly, the database was scanned once to build FFP-Tree, and the overhead of pattern extraction was reduced based on fuzzy pruning strategy. The FFP-array structure was used to streamline the search method to further reduce the space and time complexity. The experimental results gained from the benchmark datasets reveal that the proposed MFFP-Tree has outstanding performance by comparing with PADS and FPMax ^* algorithms:the time complexity of the proposed algorithm is optimized by twice to one order of magnitude for different datasets, and the spatial complexity of the proposed algorithm is optimized by one order of magnitude to two orders of magnitude, respectively.

Reference | Related Articles | Metrics